Turnbull China Bikeride

Text File | 1995-06-28 | 21KB | 634 lines

Options Previous: <YACC interface=>YACCinterf> * Next: <Performance=>Performanc> * Up: <Top=>!Root> #Wrap on {fH3}Options{f} {fCode}flex{f} has the following options: #Indent +4 #Indent {fEmphasis}-b{f} #Indent +4 Generate backing-up information to {fCite}lex.backup{f}. This is a list of scanner states which require backing up and the input characters on which they do so. By adding rules one can remove backing-up states. If {fEmphasis}all{f} backing-up states are eliminated and {fEmphasis}-Cf{f} or {fEmphasis}-CF{f} is used, the generated scanner will run faster (see the {fEmphasis}-p{f} flag). Only users who wish to squeeze every last cycle out of their scanners need worry about this option. (See the section on Performance Considerations below.) #Indent {fEmphasis}-c{f} #Indent +4 is a do-nothing, deprecated option included for POSIX compliance. #Indent {fEmphasis}-d{f} #Indent +4 makes the generated scanner run in {fUnderline}debug{f} mode. Whenever a pattern is recognized and the global {fCode}yy\_flex\_debug{f} is non-zero (which is the default), the scanner will write to {fCode}stderr{f} a line of the form: #Wrap off #fCode --accepting rule at line 53 ("the matched text") #f #Wrap on The line number refers to the location of the rule in the file defining the scanner (i.e., the file that was fed to flex). Messages are also generated when the scanner backs up, accepts the default rule, reaches the end of its input buffer (or encounters a NUL; at this point, the two look the same as far as the scanner's concerned), or reaches an end-of-file. #Indent {fEmphasis}-f{f} #Indent +4 specifies {fUnderline}fast scanner{f}. No table compression is done and stdio is bypassed. The result is large but fast. This option is equivalent to {fEmphasis}-Cfr{f} (see below). #Indent {fEmphasis}-h{f} #Indent +4 generates a "help" summary of {fCode}flex's{f} options to {fCode}stdout{f} and then exits. {fEmphasis}-?{f} and {fEmphasis}--help{f} are synonyms for {fEmphasis}-h{f}. 
#Indent {fEmphasis}-i{f} #Indent +4 instructs {fCode}flex{f} to generate a {fEmphasis}case-insensitive{f} scanner. The case of letters given in the {fCode}flex{f} input patterns will be ignored, and tokens in the input will be matched regardless of case. The matched text given in {fCode}yytext{f} will have the preserved case (i.e., it will not be folded). #Indent {fEmphasis}-l{f} #Indent +4 turns on maximum compatibility with the original AT&T {fCode}lex{f} implementation. Note that this does not mean {fEmphasis}full{f} compatibility. Use of this option costs a considerable amount of performance, and it cannot be used with the {fEmphasis}-+, -f, -F, -Cf{f}, or {fEmphasis}-CF{f} options. For details on the compatibilities it provides, see the section "Incompatibilities With Lex And POSIX" below. This option also results in the name {fCode}YY\_FLEX\_LEX\_COMPAT{f} being \#define'd in the generated scanner. #Indent {fEmphasis}-n{f} #Indent +4 is another do-nothing, deprecated option included only for POSIX compliance. #Indent {fEmphasis}-p{f} #Indent +4 generates a performance report to stderr. The report consists of comments regarding features of the {fCode}flex{f} input file which will cause a serious loss of performance in the resulting scanner. If you give the flag twice, you will also get comments regarding features that lead to minor performance losses. Note that the use of {fCode}REJECT{f}, {fEmphasis}%option yylineno{f} and variable trailing context (see the Deficiencies \/ Bugs section below) entails a substantial performance penalty; use of {fEmphasis}yymore(){f}, the {fEmphasis}^{f} operator, and the {fEmphasis}-I{f} flag entail minor performance penalties. #Indent {fEmphasis}-s{f} #Indent +4 causes the {fUnderline}default rule{f} (that unmatched scanner input is echoed to {fCode}stdout{f}) to be suppressed. If the scanner encounters input that does not match any of its rules, it aborts with an error. 
This option is useful for finding holes in a scanner's rule set. #Indent {fEmphasis}-t{f} #Indent +4 instructs {fCode}flex{f} to write the scanner it generates to standard output instead of {fCite}lex.yy.c{f}. #Indent {fEmphasis}-v{f} #Indent +4 specifies that {fCode}flex{f} should write to {fCode}stderr{f} a summary of statistics regarding the scanner it generates. Most of the statistics are meaningless to the casual {fCode}flex{f} user, but the first line identifies the version of {fCode}flex{f} (same as reported by {fEmphasis}-V{f}), and the next line the flags used when generating the scanner, including those that are on by default. #Indent {fEmphasis}-w{f} #Indent +4 suppresses warning messages. #Indent {fEmphasis}-B{f} #Indent +4 instructs {fCode}flex{f} to generate a {fEmphasis}batch{f} scanner, the opposite of {fEmphasis}interactive{f} scanners generated by {fEmphasis}-I{f} (see below). In general, you use {fEmphasis}-B{f} when you are {fEmphasis}certain{f} that your scanner will never be used interactively, and you want to squeeze a {fEmphasis}little{f} more performance out of it. If your goal is instead to squeeze out a {fEmphasis}lot{f} more performance, you should be using the {fEmphasis}-Cf{f} or {fEmphasis}-CF{f} options (discussed below), which turn on {fEmphasis}-B{f} automatically anyway. #Indent {fEmphasis}-F{f} #Indent +4 specifies that the {fUnderline}fast{f} scanner table representation should be used (and stdio bypassed). This representation is about as fast as the full table representation {fEmphasis}(-f){f}, and for some sets of patterns will be considerably smaller (and for others, larger). In general, if the pattern set contains both "keywords" and a catch-all, "identifier" rule, such as in the set:
#Wrap off
#fCode
"case"    return TOK\_CASE;
"switch"  return TOK\_SWITCH;
...
"default" return TOK\_DEFAULT;
[a-z]+    return TOK\_ID;
#f
#Wrap on
then you're better off using the full table representation. 
If only the "identifier" rule is present and you then use a hash table or some such to detect the keywords, you're better off using {fEmphasis}-F{f}. This option is equivalent to {fEmphasis}-CFr{f} (see below). It cannot be used with {fEmphasis}-+{f}. #Indent {fEmphasis}-I{f} #Indent +4 instructs {fCode}flex{f} to generate an {fEmphasis}interactive{f} scanner. An interactive scanner is one that only looks ahead to decide what token has been matched if it absolutely must. It turns out that always looking one extra character ahead, even if the scanner has already seen enough text to disambiguate the current token, is a bit faster than only looking ahead when necessary. But scanners that always look ahead give dreadful interactive performance; for example, when a user types a newline, it is not recognized as a newline token until they enter {fEmphasis}another{f} token, which often means typing in another whole line. {fCode}Flex{f} scanners default to {fEmphasis}interactive{f} unless you use the {fEmphasis}-Cf{f} or {fEmphasis}-CF{f} table-compression options (see below). That's because if you're looking for high-performance you should be using one of these options, so if you didn't, {fCode}flex{f} assumes you'd rather trade off a bit of run-time performance for intuitive interactive behavior. Note also that you {fEmphasis}cannot{f} use {fEmphasis}-I{f} in conjunction with {fEmphasis}-Cf{f} or {fEmphasis}-CF{f}. Thus, this option is not really needed; it is on by default for all those cases in which it is allowed. You can force a scanner to {fEmphasis}not{f} be interactive by using {fEmphasis}-B{f} (see above). #Indent {fEmphasis}-L{f} #Indent +4 instructs {fCode}flex{f} not to generate {fEmphasis}\#line{f} directives. 
Without this option, {fCode}flex{f} peppers the generated scanner with \#line directives so error messages in the actions will be correctly located with respect to either the original {fCode}flex{f} input file (if the errors are due to code in the input file), or {fCite}lex.yy.c{f} (if the errors are {fCode}flex's{f} fault -- you should report these sorts of errors to the email address given below). #Indent {fEmphasis}-T{f} #Indent +4 makes {fCode}flex{f} run in {fCode}trace{f} mode. It will generate a lot of messages to {fCode}stderr{f} concerning the form of the input and the resultant non-deterministic and deterministic finite automata. This option is mostly for use in maintaining {fCode}flex{f}. #Indent {fEmphasis}-V{f} #Indent +4 prints the version number to {fCode}stdout{f} and exits. {fEmphasis}--version{f} is a synonym for {fEmphasis}-V{f}. #Indent {fEmphasis}-7{f} #Indent +4 instructs {fCode}flex{f} to generate a 7-bit scanner, i.e., one which can only recognize 7-bit characters in its input. The advantage of using {fEmphasis}-7{f} is that the scanner's tables can be up to half the size of those generated using the {fEmphasis}-8{f} option (see below). The disadvantage is that such scanners often hang or crash if their input contains an 8-bit character. Note, however, that unless you generate your scanner using the {fEmphasis}-Cf{f} or {fEmphasis}-CF{f} table compression options, use of {fEmphasis}-7{f} will save only a small amount of table space, and make your scanner considerably less portable. {fCode}Flex's{f} default behavior is to generate an 8-bit scanner unless you use {fEmphasis}-Cf{f} or {fEmphasis}-CF{f}, in which case {fCode}flex{f} defaults to generating 7-bit scanners unless your site was always configured to generate 8-bit scanners (as will often be the case with non-USA sites). You can tell whether flex generated a 7-bit or an 8-bit scanner by inspecting the flag summary in the {fEmphasis}-v{f} output as described above. 
Note that if you use {fEmphasis}-Cfe{f} or {fEmphasis}-CFe{f} (those table compression options, but also using equivalence classes as discussed below), flex still defaults to generating an 8-bit scanner, since usually with these compression options full 8-bit tables are not much more expensive than 7-bit tables. #Indent {fEmphasis}-8{f} #Indent +4 instructs {fCode}flex{f} to generate an 8-bit scanner, i.e., one which can recognize 8-bit characters. This flag is only needed for scanners generated using {fEmphasis}-Cf{f} or {fEmphasis}-CF{f}, as otherwise flex defaults to generating an 8-bit scanner anyway. See the discussion of {fEmphasis}-7{f} above for flex's default behavior and the tradeoffs between 7-bit and 8-bit scanners. #Indent {fEmphasis}-+{f} #Indent +4 specifies that you want flex to generate a C++ scanner class. See the section on Generating C++ Scanners below for details. #Indent {fEmphasis}-C[aefFmr]{f} #Indent +4 controls the degree of table compression and, more generally, trade-offs between small scanners and fast scanners. {fEmphasis}-Ca{f} ("align") instructs flex to trade off larger tables in the generated scanner for faster performance because the elements of the tables are better aligned for memory access and computation. On some RISC architectures, fetching and manipulating long-words is more efficient than with smaller-sized units such as shortwords. This option can double the size of the tables used by your scanner. {fEmphasis}-Ce{f} directs {fCode}flex{f} to construct {fUnderline}equivalence classes{f}, i.e., sets of characters which have identical lexical properties (for example, if the only appearance of digits in the {fCode}flex{f} input is in the character class "[0-9]" then the digits '0', '1', …, '9' will all be put in the same equivalence class). 
Equivalence classes usually give dramatic reductions in the final table\/object file sizes (typically a factor of 2-5) and are pretty cheap performance-wise (one array look-up per character scanned). {fEmphasis}-Cf{f} specifies that the {fEmphasis}full{f} scanner tables should be generated - {fCode}flex{f} should not compress the tables by taking advantage of similar transition functions for different states. {fEmphasis}-CF{f} specifies that the alternate fast scanner representation (described above under the {fEmphasis}-F{f} flag) should be used. This option cannot be used with {fEmphasis}-+{f}. {fEmphasis}-Cm{f} directs {fCode}flex{f} to construct {fUnderline}meta-equivalence classes{f}, which are sets of equivalence classes (or characters, if equivalence classes are not being used) that are commonly used together. Meta-equivalence classes are often a big win when using compressed tables, but they have a moderate performance impact (one or two "if" tests and one array look-up per character scanned). {fEmphasis}-Cr{f} causes the generated scanner to {fEmphasis}bypass{f} use of the standard I\/O library (stdio) for input. Instead of calling {fEmphasis}fread(){f} or {fEmphasis}getc(){f}, the scanner will use the {fEmphasis}read(){f} system call, resulting in a performance gain which varies from system to system, but in general is probably negligible unless you are also using {fEmphasis}-Cf{f} or {fEmphasis}-CF{f}. Using {fEmphasis}-Cr{f} can cause strange behavior if, for example, you read from {fCode}yyin{f} using stdio prior to calling the scanner (because the scanner will miss whatever text your previous reads left in the stdio input buffer). {fEmphasis}-Cr{f} has no effect if you define {fCode}YY\_INPUT{f} (see The Generated Scanner above). A lone {fEmphasis}-C{f} specifies that the scanner tables should be compressed but neither equivalence classes nor meta-equivalence classes should be used. 
The options {fEmphasis}-Cf{f} or {fEmphasis}-CF{f} and {fEmphasis}-Cm{f} do not make sense together - there is no opportunity for meta-equivalence classes if the table is not being compressed. Otherwise the options may be freely mixed, and are cumulative. The default setting is {fEmphasis}-Cem{f}, which specifies that {fCode}flex{f} should generate equivalence classes and meta-equivalence classes. This setting provides the highest degree of table compression. You can trade larger tables for faster-executing scanners, with the following generally being true:
#Wrap off
#fCode
slowest & smallest
      -Cem
      -Cm
      -Ce
      -C
      -C\{f,F\}e
      -C\{f,F\}
      -C\{f,F\}a
fastest & largest
#f
#Wrap on
Note that scanners with the smallest tables are usually generated and compiled the quickest, so during development you will usually want to use the default, maximal compression. {fEmphasis}-Cfe{f} is often a good compromise between speed and size for production scanners. #Indent {fEmphasis}-ooutput{f} #Indent +4 directs flex to write the scanner to the file {fCode}output{f} instead of {fCite}lex.yy.c{f}. If you combine {fEmphasis}-o{f} with the {fEmphasis}-t{f} option, then the scanner is written to {fCode}stdout{f} but its {fEmphasis}\#line{f} directives (see the {fEmphasis}-L{f} option above) refer to the file {fCode}output{f}. #Indent {fEmphasis}-Pprefix{f} #Indent +4 changes the default {fEmphasis}yy{f} prefix used by {fCode}flex{f} for all globally-visible variable and function names to instead be {fStrong}prefix{f}. For example, {fEmphasis}-Pfoo{f} changes the name of {fCode}yytext{f} to {fCite}footext{f}. It also changes the name of the default output file from {fCite}lex.yy.c{f} to {fCite}lex.foo.c{f}. 
Here are all of the names affected: #Wrap off #fCode yy\_create\_buffer yy\_delete\_buffer yy\_flex\_debug yy\_init\_buffer yy\_flush\_buffer yy\_load\_buffer\_state yy\_switch\_to\_buffer yyin yyleng yylex yylineno yyout yyrestart yytext yywrap #f #Wrap on (If you are using a C++ scanner, then only {fCode}yywrap{f} and {fCode}yyFlexLexer{f} are affected.) Within your scanner itself, you can still refer to the global variables and functions using either version of their name; but externally, they have the modified name. This option lets you easily link together multiple {fCode}flex{f} programs into the same executable. Note, though, that using this option also renames {fEmphasis}yywrap(){f}, so you now {fEmphasis}must{f} either provide your own (appropriately-named) version of the routine for your scanner, or use {fEmphasis}%option noyywrap{f}, as linking with {fEmphasis}-lfl{f} no longer provides one for you by default. #Indent {fEmphasis}-Sskeleton\_file{f} #Indent +4 overrides the default skeleton file from which {fCode}flex{f} constructs its scanners. You'll never need this option unless you are doing {fCode}flex{f} maintenance or development. #Indent {fCode}flex{f} also provides a mechanism for controlling options within the scanner specification itself, rather than from the flex command-line. This is done by including {fEmphasis}%option{f} directives in the first section of the scanner specification. You can specify multiple options with a single {fEmphasis}%option{f} directive, and multiple directives in the first section of your flex input file. Most options are given simply as names, optionally preceded by the word "no" (with no intervening whitespace) to negate their meaning. 
A number are equivalent to flex flags or their negation:
#Wrap off
#fCode
7bit            -7 option
8bit            -8 option
align           -Ca option
backup          -b option
batch           -B option
c++             -+ option
caseful or
case-sensitive  opposite of -i (default)
case-insensitive or
caseless        -i option
debug           -d option
default         opposite of -s option
ecs             -Ce option
fast            -F option
full            -f option
interactive     -I option
lex-compat      -l option
meta-ecs        -Cm option
perf-report     -p option
read            -Cr option
stdout          -t option
verbose         -v option
warn            opposite of -w option
                (use "%option nowarn" for -w)
array           equivalent to "%array"
pointer         equivalent to "%pointer" (default)
#f
#Wrap on
Some {fEmphasis}%option's{f} provide features otherwise not available: #Indent +4 #Indent {fEmphasis}always-interactive{f} #Indent +4 instructs flex to generate a scanner which always considers its input "interactive". Normally, on each new input file the scanner calls {fEmphasis}isatty(){f} in an attempt to determine whether the scanner's input source is interactive and thus should be read a character at a time. When this option is used, however, then no such call is made. #Indent {fEmphasis}main{f} #Indent +4 directs flex to provide a default {fEmphasis}main(){f} program for the scanner, which simply calls {fEmphasis}yylex(){f}. This option implies {fCode}noyywrap{f} (see below). #Indent {fEmphasis}never-interactive{f} #Indent +4 instructs flex to generate a scanner which never considers its input "interactive" (again, no call made to {fEmphasis}isatty(){f}). This is the opposite of {fEmphasis}always-interactive{f}. #Indent {fEmphasis}stack{f} #Indent +4 enables the use of start condition stacks (see Start Conditions above). #Indent {fEmphasis}stdinit{f} #Indent +4 if unset (i.e., {fEmphasis}%option nostdinit{f}) initializes {fCode}yyin{f} and {fCode}yyout{f} to nil {fCode}FILE{f} pointers, instead of {fCode}stdin{f} and {fCode}stdout{f}. 
#Indent {fEmphasis}yylineno{f} #Indent +4 directs {fCode}flex{f} to generate a scanner that maintains the number of the current line read from its input in the global variable {fCode}yylineno{f}. This option is implied by {fEmphasis}%option lex-compat{f}. #Indent {fEmphasis}yywrap{f} #Indent +4 if unset (i.e., {fEmphasis}%option noyywrap{f}), makes the scanner not call {fEmphasis}yywrap(){f} upon an end-of-file, but simply assume that there are no more files to scan (until the user points {fCode}yyin{f} at a new file and calls {fEmphasis}yylex(){f} again). #Indent {fCode}flex{f} scans your rule actions to determine whether you use the {fCode}REJECT{f} or {fEmphasis}yymore(){f} features. The {fCode}reject{f} and {fCode}yymore{f} options are available to override its decision as to whether you use the options, either by setting them (e.g., {fEmphasis}%option reject{f}) to indicate the feature is indeed used, or unsetting them to indicate it actually is not used (e.g., {fEmphasis}%option noyymore{f}). Three options take string-delimited values, offset with '=': #Wrap off #fCode %option outfile="ABC" #f #Wrap on is equivalent to {fEmphasis}-oABC{f}, and #Wrap off #fCode %option prefix="XYZ" #f #Wrap on is equivalent to {fEmphasis}-PXYZ{f}. Finally, #Wrap off #fCode %option yyclass="foo" #f #Wrap on only applies when generating a C++ scanner ({fEmphasis}-+{f} option). It informs {fCode}flex{f} that you have derived {fEmphasis}foo{f} as a subclass of {fCode}yyFlexLexer{f} so {fCode}flex{f} will place your actions in the member function {fEmphasis}foo::yylex(){f} instead of {fEmphasis}yyFlexLexer::yylex(){f}. It also generates a {fEmphasis}yyFlexLexer::yylex(){f} member function that emits a run-time error (by invoking {fEmphasis}yyFlexLexer::LexerError(){f}) if called. See Generating C++ Scanners, below, for additional information. A number of options are available for lint purists who want to suppress the appearance of unneeded routines in the generated scanner. 
Each of the following, if unset, results in the corresponding routine not appearing in the generated scanner: #Wrap off #fCode input, unput yy\_push\_state, yy\_pop\_state, yy\_top\_state yy\_scan\_buffer, yy\_scan\_bytes, yy\_scan\_string #f #Wrap on (though {fEmphasis}yy\_push\_state(){f} and friends won't appear anyway unless you use {fEmphasis}%option stack{f}).